In this paper a global reactive motion planning framework for robotic manipulators in complex dynamic environments is presented. In particular, the circular field predictions (CFP) planner from Becker et al. (2021) is extended to ensure obstacle avoidance of the whole structure of a robotic manipulator. Towards this end, a motion planning framework is developed that leverages global information about promising avoidance directions from arbitrary configuration space motion planners, resulting in improved global trajectories while reactively avoiding dynamic obstacles and decreasing the required computational power. The resulting motion planning framework is tested in multiple simulations with complex and dynamic obstacles and demonstrates great potential compared to existing motion planning approaches.
As the accuracy of machine learning models increases at a fast rate, so does their demand for energy and compute resources. On a low level, the major part of these resources is consumed by data movement between different memory units. Modern hardware architectures contain a form of fast memory (e.g., cache, registers), which is small, and a slow memory (e.g., DRAM), which is larger but expensive to access. We can only process data that is stored in fast memory, which incurs data movement (input/output-operations, or I/Os) between the two units. In this paper, we provide a rigorous theoretical analysis of the I/Os needed in sparse feedforward neural network (FFNN) inference. We establish bounds that determine the optimal number of I/Os up to a factor of 2 and present a method that uses a number of I/Os within that range. Much of the I/O-complexity is determined by a few high-level properties of the FFNN (number of inputs, outputs, neurons, and connections), but if we want to get closer to the exact lower bound, the instance-specific sparsity patterns need to be considered. Departing from the 2-optimal computation strategy, we show how to reduce the number of I/Os further with simulated annealing. Complementing this result, we provide an algorithm that constructively generates networks with maximum I/O-efficiency for inference. We test the algorithms and empirically verify our theoretical and algorithmic contributions. In our experiments on real hardware we observe speedups of up to 45$\times$ relative to the standard way of performing inference.
Federated Learning (FL) is a scheme for collaboratively training Deep Neural Networks (DNNs) with multiple data sources from different clients. Instead of sharing the data, each client trains the model locally, resulting in improved privacy. However, so-called targeted poisoning attacks have recently been proposed that allow individual clients to inject a backdoor into the trained model. Existing defenses against these backdoor attacks either rely on techniques like Differential Privacy to mitigate the backdoor, or analyze the weights of the individual models and apply outlier detection methods, which restricts these defenses to certain data distributions. However, adding noise to the models' parameters or excluding benign outliers might also reduce the accuracy of the collaboratively trained model. Additionally, allowing the server to inspect the clients' models creates a privacy risk due to existing knowledge extraction methods. We propose CrowdGuard, a model filtering defense that mitigates backdoor attacks by leveraging the clients' data to analyze the individual models before the aggregation. To prevent data leaks, the server sends the individual models to secure enclaves running in client-located Trusted Execution Environments. To effectively distinguish benign and poisoned models, even if the data of different clients are not independently and identically distributed (non-IID), we introduce a novel metric called HLBIM to analyze the outputs of the DNN's hidden layers. We show that the applied significance-based detection algorithm, combined with this metric, can effectively detect poisoned models, even in non-IID scenarios. We show in our extensive evaluation that CrowdGuard can effectively mitigate targeted poisoning attacks and achieves, in various scenarios, a True-Positive-Rate of 100% and a True-Negative-Rate of 100%.
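Safety-critical systems are typically subjected to hazard analysis before commissioning to identify and analyze potentially hazardous system states that may arise during operation. Currently, hazard analysis is mainly based on human reasoning, past experience, and simple tools such as checklists and spreadsheets. Increasing system complexity makes such approaches less and less suitable. Furthermore, testing-based hazard analysis is often not suitable due to high costs or the danger of physical faults. A remedy for this is model-based hazard analysis, which relies either on formal models or on simulation models, each with its own benefits and drawbacks. This paper proposes a two-layer approach that combines the benefits of exhaustive analysis using formal methods with detailed analysis using simulation. Unsafe behaviors that lead to unsafe states are first synthesized from a formal model of the system using Supervisory Control Theory. The result is then fed into a simulation, where detailed analyses are performed using domain-specific risk metrics. Although the presented approach is generally applicable, this paper demonstrates its benefits on an industrial human-robot collaboration system.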
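Graph databases (GDBs) enable the processing and analysis of unstructured, complex, rich, and usually vast graph datasets. Despite the large significance of GDBs in both academia and industry, little effort has been made to integrate them with the predictive power of graph neural networks (GNNs). In this work, we show how to seamlessly combine nearly any GNN model with the computational capabilities of GDBs. For this, we observe that the majority of these systems are based on, or support, a graph data model called the Labeled Property Graph (LPG), in which vertices and edges can have arbitrarily complex sets of labels and properties. We then develop LPG2vec, an encoder that transforms an arbitrary LPG dataset into a representation that can be used directly with a broad class of GNNs, including convolutional, attentional, message-passing, and even higher-order or spectral models. In our evaluation, we show that LPG2vec properly preserves the rich information represented as LPG labels and properties, and that it increases prediction accuracy by up to 34% compared to graphs with no LPG labels/properties, regardless of the targeted learning task or the GNN model used. In general, LPG2vec makes it possible to combine the predictive power of the most powerful GNNs with the full scope of information encoded in the LPG model, paving the way for neural graph databases, a class of systems in which the vast complexity of the maintained data will benefit from modern and future graph machine learning methods.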
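Exponentially growing model sizes drive the continued success of deep learning, but they bring prohibitive computation and memory costs. From the algorithm perspective, model sparsification and quantization have been studied to alleviate the problem. From the architecture perspective, hardware vendors provide Tensor cores for acceleration. However, it is very challenging to obtain practical speedups from sparse, low-precision matrix operations, because of the strict data layout requirements and the lack of support for efficiently manipulating low-precision integers. We propose Magicube, a high-performance sparse-matrix library for low-precision integers on Tensor cores. Magicube supports SpMM and SDDMM, two major sparse operations in deep learning. Experimental results on an NVIDIA A100 GPU show that Magicube achieves on average a 1.44x (up to 2.37x) speedup over the vendor-optimized library for sparse kernels, and a 1.43x speedup over the state of the art, with comparable accuracy, for end-to-end sparse Transformer inference.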
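For readers unfamiliar with the two operations named above, the following is a plain-Python reference sketch of what SpMM and SDDMM compute. It is not Magicube's implementation: it ignores Tensor cores, data layout, and low-precision integers entirely, and the function names and the dictionary-based sparse format are assumptions made for illustration.

```python
def spmm(sparse, dense, n_rows):
    """SpMM: multiply a sparse matrix by a dense matrix.

    `sparse` is a hypothetical coordinate format: {(row, col): value}.
    `dense` is a list of rows. Returns a dense list of rows.
    """
    n_cols = len(dense[0])
    out = [[0] * n_cols for _ in range(n_rows)]
    # Only the stored nonzeros of the sparse operand contribute.
    for (i, k), value in sparse.items():
        for j in range(n_cols):
            out[i][j] += value * dense[k][j]
    return out


def sddmm(mask, a, b):
    """SDDMM: compute (a @ b) only at the nonzero positions of `mask`,
    scaling each sampled entry by the corresponding mask value.

    `mask` uses the same {(row, col): value} format; `a` and `b` are dense.
    """
    inner = len(b)
    return {
        (i, j): s * sum(a[i][k] * b[k][j] for k in range(inner))
        for (i, j), s in mask.items()
    }


sparse = {(0, 1): 2, (1, 0): 3}
dense = [[1, 2], [3, 4]]
spmm_result = spmm(sparse, dense, n_rows=2)

mask = {(0, 0): 1, (1, 1): 2}
sddmm_result = sddmm(mask, [[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

In both cases the work scales with the number of stored nonzeros rather than with the full matrix size, which is the property that sparse kernels aim to exploit.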
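Numerous microarchitectural optimizations unlocked tremendous processing power for deep neural networks, which in turn fueled the AI revolution. With the exhaustion of such optimizations, the growth of modern AI is now gated by the performance of training systems, especially their data movement. Instead of focusing on single accelerators, we investigate the data-movement characteristics of large-scale training at full system scale. Based on our workload analysis, we design HammingMesh, a novel network topology that provides high bandwidth at low cost with high job scheduling flexibility. Specifically, HammingMesh can support full bandwidth and isolation for deep learning training jobs with two dimensions of parallelism. Furthermore, it also supports high global bandwidth for generic traffic. Thus, HammingMesh will power future large-scale deep learning systems with extreme bandwidth requirements.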
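We present an online, data-driven uncertainty quantification method to enable the development of safe human-robot collaboration applications. The safety and risk assessment of a system is strongly correlated with the accuracy of its measurements: distinctive parameters are often not directly accessible via known models and must therefore be measured. However, measurements generally suffer from uncertainties due to the limited performance of sensors, unknown environmental disturbances, or humans. In this work, we quantify these measurement uncertainties by making use of conservation measures, which are quantitative, system-specific properties that are constant over time, space, or other state-space dimensions. The key idea of our method is to evaluate incoming data immediately at run time against conservation equations. In particular, we estimate violations of a priori known, domain-specific conservation properties and treat them as the consequence of measurement uncertainties. We validate our method on a use case in the context of human-robot collaboration, thereby highlighting the importance of our contribution for the successful development of safe robot systems under real-world conditions, e.g., in industrial environments. In addition, we show how the obtained uncertainty values can be mapped directly onto arbitrary safety limits (e.g., ISO 13849), which makes it possible to monitor compliance with safety standards at run time.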
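The core idea, treating violations of a known conserved quantity as a proxy for measurement uncertainty, can be illustrated with a small sketch. Everything specific here is an assumption and not taken from the paper: the conserved quantity (the distance between two tracked points on a rigid segment), the root-mean-square summary, and the function names are all hypothetical.

```python
import math


def conservation_residuals(track_a, track_b, known_length):
    """Per-sample violation of an assumed conservation property.

    `track_a` and `track_b` are sequences of 3D positions of two points that
    are assumed to be rigidly connected, so their distance should always
    equal `known_length`. Any deviation is attributed to measurement error.
    """
    return [math.dist(a, b) - known_length for a, b in zip(track_a, track_b)]


def rms_uncertainty(residuals):
    """Summarize the violations as a root-mean-square value (one possible
    choice of summary, assumed here for illustration)."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


# Hypothetical measurements: point A is fixed, point B is nominally 0.30 m away.
track_a = [(0.0, 0.0, 0.0)] * 3
track_b = [(0.31, 0.0, 0.0), (0.29, 0.0, 0.0), (0.30, 0.0, 0.0)]

residuals = conservation_residuals(track_a, track_b, known_length=0.30)
uncertainty = rms_uncertainty(residuals)

# Compare against a hypothetical tolerance; a real limit would come from the
# applicable safety standard and is not specified here.
within_limit = uncertainty <= 0.02
```

Because the residuals are computed sample by sample, the same check can run continuously on incoming data rather than only in an offline evaluation.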
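Folding garments reliably and efficiently is a long-standing challenge in robotic manipulation due to the complex dynamics and high-dimensional configuration space of garments. An intuitive approach is to first manipulate the garment into a canonical smooth configuration before folding. In this work, we develop a reliable and efficient bimanual system that, given user-defined instructions in the form of folding lines, manipulates an initially crumpled garment into (1) a smoothed and (2) a folded configuration. Our primary contribution is a novel neural network architecture that is able to predict pairs of gripper poses to parameterize a diverse set of bimanual action primitives. After learning from 4300 human-annotated and self-supervised actions, the robot is able to fold garments from a random initial configuration in under 120 s on average with a success rate of 93%. Real-world experiments show that the system is able to generalize to garments of different color, shape, and stiffness. While prior work achieved 3-6 Folds Per Hour (FPH), SpeedFolding achieves 30-40 FPH.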
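AR/VR applications and robots need to know when the scene has changed. An example is when objects are moved, added, or removed from the scene. We propose a 3D object discovery method that is based only on scene changes. Our method does not need to encode any assumptions about what constitutes an object, but instead discovers objects by exploiting their coherent motion. Changes are initially detected as differences in the depth maps and are segmented as objects if they undergo rigid motions. A graph cut optimization propagates the change labels to geometrically consistent regions. Experiments show that our method achieves state-of-the-art performance on the 3RScan dataset against competitive baselines. The source code of our method can be found at https://github.com/katadam/objectscanmove.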
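The first step described above, detecting changes as differences between depth maps, can be sketched as follows. This covers only that initial step under simplifying assumptions (two aligned depth maps from the same viewpoint, a fixed threshold); the rigid-motion segmentation and graph cut stages are not shown, and the function name and threshold value are hypothetical rather than taken from the linked repository.

```python
def changed_pixels(depth_before, depth_after, threshold=0.05):
    """Return a boolean mask marking pixels whose depth changed.

    Both inputs are lists of rows of depth values in meters, assumed to be
    rendered from the same camera pose. A pixel is flagged when the absolute
    depth difference exceeds `threshold` (an assumed value).
    """
    return [
        [abs(before - after) > threshold
         for before, after in zip(row_before, row_after)]
        for row_before, row_after in zip(depth_before, depth_after)
    ]


# Toy 2x2 depth maps: one pixel moves 0.5 m closer, another changes by 1 cm.
depth_before = [[1.0, 1.0], [2.0, 2.0]]
depth_after = [[1.0, 0.5], [2.0, 2.01]]

change_mask = changed_pixels(depth_before, depth_after)
```

The threshold trades off sensitivity against sensor noise: the 1 cm difference in the toy example stays below it and is not flagged as a change.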